Goto

Collaborating Authors

 fast federated optimization


FedSplit: an algorithmic framework for fast federated optimization

Neural Information Processing Systems

Motivated by federated learning, we consider the hub-and-spoke model of distributed optimization in which a central authority coordinates the computation of a solution among many agents while limiting communication. We first study some past procedures for federated optimization, and show that their fixed points need not correspond to stationary points of the original optimization problem, even in simple convex settings with deterministic updates. In order to remedy these issues, we introduce FedSplit, a class of algorithms based on operator splitting procedures for solving distributed convex minimization with additive structure. We prove that these procedures have the correct fixed points, corresponding to optima of the original optimization problem, and we characterize their convergence rates under different settings. Our theory shows that these methods are provably robust to inexact computation of intermediate local quantities. We complement our theory with some experiments that demonstrate the benefits of our methods in practice.


Review for NeurIPS paper: FedSplit: an algorithmic framework for fast federated optimization

Neural Information Processing Systems

Weaknesses: Main criticism: 1) The paper claims two main contributions, one of which is "The first contribution of this paper is to analyze some past procedures, and show that even in the favorable setting of deterministic updates (i.e., no stochastic approximation used), these methods typically fail to preserve solutions of the original optimization problem as fixed points " I believe the text above is misleading. In fact, it was already well known for the "past procedures" to not have the correct fixed points; one alternative approach to deal with such an issue was to incorporate the "drift"; see https://arxiv.org/abs/1910.06378 for example. Therefore, believe it would be more appropriate to not claim the contribution for showing the wrong fixed point of the local algorithms. Specifically, in strongly convex case (same arguments apply for weakly convex), the communication complexity of FedSplit is O(sqrt(kappa)log(1/epsilon)), which is identical to the communication complexity of AGD. In fact, AGD is favorable (in terms of the rate) as is requires a single gradient evaluation instead of evaluating the prox with high enough precision so that the inexactness does not drive the rate.


FedSplit: an algorithmic framework for fast federated optimization

Neural Information Processing Systems

Motivated by federated learning, we consider the hub-and-spoke model of distributed optimization in which a central authority coordinates the computation of a solution among many agents while limiting communication. We first study some past procedures for federated optimization, and show that their fixed points need not correspond to stationary points of the original optimization problem, even in simple convex settings with deterministic updates. In order to remedy these issues, we introduce FedSplit, a class of algorithms based on operator splitting procedures for solving distributed convex minimization with additive structure. We prove that these procedures have the correct fixed points, corresponding to optima of the original optimization problem, and we characterize their convergence rates under different settings. Our theory shows that these methods are provably robust to inexact computation of intermediate local quantities.